********************************************************************************
*Replication code for:
*Participation Incentives in a Survey of International Non-Profit Professionals.
*Safarpour, Bush, and Hadden (2022).
********************************************************************************


*********************************************************************************************
	*Response Rates by condition. 
		*(First numbers correspond to disposition codes, items after colon refer to survey data.)
		**1.1 complete: progress=100 and/or finished=1
		**1.2 partially complete: progress<100 but >0
		
		**2.111 explicit refusal: R replies saying they don't want to participate.
		**2.112 implicit refusal: progress=0 or progress=6 (did not get past the consent form).
		
		**3.30 undeliverable: email failed to be delivered.
		**3.40 forwarding information is obtained.
		**3.19 unknown whether invitation reached the respondent. No info received.
		**4.10 ineligible due to screener Q.
		**4.81 ineligible due to duplicate listings.
*********************************************************************************************
	
*Import survey data.
import delimited "ParticipationIncentivesDataAnonymized.csv", delimiter(comma) varnames(1) clear 


	*One recipient was sent a custom email that did not have an embedded data field. As a result, 
		**none of the conjoint blocks were filled out.
		**We drop this case for the data quality analyses but include him as completing the survey
		**for the response rates calculations. He was assigned to the control group.
		drop if responseid=="R_20Z3OdQH2Wn2QtZ"			

	*View finished status and progress by screener and experimental condition.	
	bysort condition: tab finished screener, missing
	bysort condition: tab progress screener, missing

	bysort condition: tab progress finished, missing

	**1.1 complete: progress=100 and/or finished=1
		***control=19 (adds the one case discussed above).
		***treat1=29
		***treat2=20
		***treat3=52
		
	**1.2 partially complete: progress<100 but >0
		***control=21
		***treat1=18
		***treat2=19
		***treat3=21
		
	**4.10 ineligible due to screener Q.
		***control=31
		***treat1=35
		***treat2=39
		***treat3=56
		
	**2.112 implicit refusal: progress=0 or progress=6 (did not get past the consent form).
		***control=4
		***treat1=3
		***treat2=6
		***treat3=4
		

*Import Distribution History data.

	**Treat2.
	import delimited "ParticipationIncentivesDistributionHistoryTreat2Anonymized.csv", delimiter(comma) varnames(1) clear 

		*View status.
		tab status, missing

	**Treat1.
	import delimited "ParticipationIncentivesDistributionHistoryTreat1Anonymized.csv", delimiter(comma) varnames(1) clear 

		*View status.
		tab status, missing
		
	**Control.
	import delimited "ParticipationIncentivesDistributionHistoryControlAnonymized.csv", delimiter(comma) varnames(1) clear 
		
		*View status.
		tab status, missing		

	**Treat3.
	import delimited "ParticipationIncentivesDistributionHistoryTreat3Anonymized.csv", delimiter(comma) varnames(1) clear 
		
		*View status.
		tab status, missing
		
	**2.111 explicit refusal: R replies saying they don't want to participate.
		
		**Distribution-History, status=Opted Out
		
			**control=41
			**treat1=45
			**treat2=43
			**treat3=35
	
	**3.30 undeliverable: email failed to be delivered.
		
		**Distribution-History, status=Email failed/ Email Bounced/ skipped as duplicate.
		
			**control=2580+16+19=2615
			**treat1=2524+14+17=2555
			**treat2=2592+11+15=2618 
			**treat3=2546+12+15=2573
			
	**3.19 unknown whether invitation reached the respondent. No info received.
		**Distribution-History, status=Email Sent.
			**control=2104 
			**treat1=2157 
			**treat2=2091 
			**treat3=2105		
	
	*Total Sample Used:
		**Distribution History, n obs.
			**control=4833+1=4834
			**treat1=4841
			**treat2=4835
			**treat3=4845
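
	*Sketch (not part of the original replication): arithmetic check of the hand-entered sums above.
	*It uses only the counts recorded in these comments; assert stops the do-file if a sum does not match.
	assert 16+2580+19==2615
	assert 2524+14+17==2555
	assert 2592+11+15==2618
	assert 2546+12+15==2573
	assert 4833+1==4834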

***Numbers are plugged into the AAPOR Response Rate calculator (ResponseRatesResults.xlsx) to determine RR across conditions.
			
**Two sample test of proportions.
	*[n rate n rate]
	*Control v Treat1 using RR2.
	prtesti 4834 0.008 4841 0.010 

		**n.s.
	*Control v Treat2 using RR2.
	prtesti 4834 0.008 4835 0.008
			**n.s.
	*Control v Treat3 using RR2.
	prtesti 4834 0.008 4845 0.015
		**Only Treat3 significantly increased RR2 relative to the control.
		**The 0.7 percentage point difference between the control and treatment is significant (p<0.05).
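
		*Sketch (not part of the original replication): the Control v Treat3 RR2 comparison above, computed by hand.
		*Assumes the pooled-variance z statistic for a two-sample test of proportions, so the result should be
		*close to the prtesti output. The chk_* scalar names are illustrative, not from the original code.
		scalar chk_n_c = 4834
		scalar chk_p_c = 0.008
		scalar chk_n_t = 4845
		scalar chk_p_t = 0.015
		*Pooled proportion under H0: equal response rates in both groups.
		scalar chk_p_pool = (chk_n_c*chk_p_c + chk_n_t*chk_p_t)/(chk_n_c + chk_n_t)
		scalar chk_se = sqrt(chk_p_pool*(1-chk_p_pool)*(1/chk_n_c + 1/chk_n_t))
		scalar chk_z = (chk_p_c - chk_p_t)/chk_se
		display "z = " %6.3f chk_z "   two-sided p = " %6.4f 2*normal(-abs(chk_z))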
	
	*Control v Treat1 using RR1.
	prtesti 4834 0.004 4841 0.006 
	*Control v Treat2 using RR1.
	prtesti 4834 0.004 4835 0.004
	*Control v Treat3 using RR1.
	prtesti 4834 0.004 4845 0.011
		**Treat3 significantly increased RR1 relative to the control.
		**The 0.7 percentage point difference between the control and treatment is statistically significant at conventional levels.
		**Control v. Treat1: the 0.2 percentage point effect is significant at the 10% level, one-tailed (Ha: (Control-Treatment1)<0, p=0.08).
	
	
	*Control v Treat1 using RR3.
	prtesti 4834 0.005 4841 0.008 
		**diff of 0.003: one-tailed p=0.0332 (Ha: diff<0); two-tailed p=0.066 (Ha: difference is not 0).
	*Control v Treat2 using RR3.
	prtesti 4834 0.005 4835 0.006
		**n.s.
	*Control v Treat3 using RR3.
	prtesti 4834 0.005 4845 0.016
		**Treat3 significantly increased RR3 relative to the control. Difference of 1.1 percentage points, p<0.001.
	
	*Control v Treat1 using RR4.
	prtesti 4834 0.011 4841 0.013 
		**n.s.
	*Control v Treat2 using RR4.
	prtesti 4834 0.011 4835 0.012
		**n.s.
	*Control v Treat3 using RR4.
	prtesti 4834 0.011 4845 0.023
		**Only Treat3 significantly increased RR4 relative to the control at conventional levels. Difference of 1.2 percentage points.
		/*Using this multiple comparisons app, https://egap.shinyapps.io/multiple-comparisons-app/
		and one-tailed p-values for RR4 (0.1832, 0.3224, and 0.0001) calculated above, alpha=0.05,
		and the Benjamini-Hochberg procedure, the p-value for the last comparison remains <0.001*/
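
*Sketch (not part of the original replication): Benjamini-Hochberg adjustment of the three one-tailed
*RR4 p-values quoted above, done by hand instead of through the web app. Variable and scalar names
*(comparison, p, bh_rank, p_bh, bh_treat3) are illustrative assumptions.
preserve
clear
input str16 comparison double p
"Control v Treat1" 0.1832
"Control v Treat2" 0.3224
"Control v Treat3" 0.0001
end
*Sort from largest to smallest p so the step-up minimum can be carried downward.
gsort -p
gen bh_rank = _N - _n + 1
gen double p_bh = min(1, p*_N/bh_rank)
replace p_bh = min(p_bh, p_bh[_n-1]) if _n>1
list comparison p p_bh
*Keep the adjusted p-value for Control v Treat3 (smallest raw p, last row after sorting).
scalar bh_treat3 = p_bh[_N]
restore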

**COOPERATION RATES- Two sample test of proportions.
	*[n rate n rate]
	*Control v Treat1 using CR1.
	prtesti 85 0.224 95 0.305
	*Control v Treat1 using CR2.
	prtesti 85 0.471 95 0.495
	*Control v Treat1 using CR3.
	prtesti 85 0.224 95 0.305
	*Control v Treat1 using CR4.
	prtesti 85 0.471 95 0.495

	*Control v Treat2 using CR1.
	prtesti 85 0.224 88 0.227
	*Control v Treat2 using CR2.
	prtesti 85 0.471 88 0.443
	*Control v Treat2 using CR3.
	prtesti 85 0.224 88 0.227
	*Control v Treat2 using CR4.
	prtesti 85 0.471 88 0.443

	*Control v Treat3 using CR1.
	prtesti 85 0.224 112 0.464
	*Control v Treat3 using CR2.
	prtesti 85 0.471 112 0.652
	*Control v Treat3 using CR3.
	prtesti 85 0.224 112 0.464
	*Control v Treat3 using CR4.
	prtesti 85 0.471 112 0.652
	
*Import survey data.
import delimited "ParticipationIncentivesDataAnonymized.csv", delimiter(comma) varnames(1) clear 

	*One recipient was sent a custom email that did not have an embedded data field. As a result, 
		**none of the conjoint blocks were filled out.
		**We drop this case for the data quality analyses but include him as completing the survey
		**for the response rates calculations. He was assigned to the control group.
		tab condition if responseid=="R_20Z3OdQH2Wn2QtZ", missing
		drop if responseid=="R_20Z3OdQH2Wn2QtZ"			
	
	tab condition, missing
	drop if condition==.
	tab condition, missing

	*Drop those who failed/ did not answer the screener.
		*drop if screener!="1"
				tab screener, missing
				drop if screener!="Yes"
				*Yields n=198.

	*Progress.
		ta progress, missing
		sort progress
		
		*drop obs where progress<=20
		drop if progress<=20
			**new n=171.
		
*Speeding Analysis: Response Times by condition.

		*using durationinseconds timer.
				summarize durationinseconds
				describe durationinseconds

				list durationinseconds in 1/10
				
		bysort condition: summarize durationinseconds, detail		
				
		
	*Appendix Figure 11- OLS regression for total survey duration in seconds (entire survey). 
		reg durationinseconds i.condition
		margins i.condition
		*Set Font to Times New Roman.
		graph set window fontface "Times New Roman"
		*Set graph color scheme to black and white.
		set scheme lean1
		marginsplot, recast(scatter) xtitle("Condition") ytitle("Average Total Completion Time in Seconds") ///
		title("") xlabel(0 "Control" 1 "Treat 1: Altruistic" 2 "Treat 2: Data" 3 "Treat 3: Monetary") plotopts(msymbol(o) ) ///
		note("Note: Average completion time in seconds by condition with 95% confidence intervals calculated using OLS regression." "Excludes respondents where progress<20%.")
		
*Speeders.
		**speeder=1 if average time per page<300 msec/word
		**Survey Part 1= 874 words*300msec/word=262200 msec or 262.2 seconds (about 4.37 minutes).
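
		*Sketch (not part of the original replication): the 300 msec/word thresholds as arithmetic.
		*Word counts are the ones given in these comments (874, 1347, 258); nothing is read from the data.
		foreach words in 874 1347 258 {
			display "`words' words: " %6.1f `words'*300/1000 " seconds (" %4.2f `words'*300/1000/60 " minutes)"
		}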
		
		reg speederpart1 i.condition
		margins i.condition
		
		**Survey Part 2= 1347 words*300 msec/word= 404100 msec or 404.1 seconds (6.74 minutes)
		
		probit speederpart2 i.condition
		margins i.condition
		
		reg speederpart2 i.condition
		margins i.condition
		
		**Survey Part 3= 258 words*300 msec/word=77400 msec or 77.4 seconds (1.29 minutes)
		
		probit speederpart3 i.condition
		margins i.condition
		
		reg speederpart3 i.condition
		margins i.condition
			
*speeder counts.
		ta speeder_count condition, col chi2
		reg speeder_count i.condition
		margins i.condition
		marginsplot

		***drop if totalduration is more than 1.5*IQR and re-run all regressions.
		summarize q_totalduration, detail		
		quietly summarize q_totalduration, detail
		scalar per25=r(p25)
		scalar per75=r(p75)
		scalar list
		gen outlier= 1.5* (per75-per25)
		sum outlier
					**1.5*IQR= about 42 min.
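
		*Sketch (not part of the original replication): Stata treats missing values as larger than any number,
		*so the drop below also removes observations with missing q_totalduration. These counts separate the
		*two groups: missing durations first, then non-missing durations above the cutoff.
		count if missing(q_totalduration)
		count if q_totalduration>outlier & !missing(q_totalduration)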
		drop if q_totalduration>outlier
				
		reg q_totalduration i.condition
		margins i.condition
		reg timerpart1 i.condition		
		reg timerpart2 i.condition
		reg timerpart3 i.condition
		

	bysort condition: sum timerpart2minute, detail
		bysort condition: sum timerpart2minute
		sort timerpart2minute

*Appendix Figure 9- Graph average completion time per survey part by condition.
	graph set window fontface "Times New Roman"
	set scheme lean1
		graph bar timerpart1minute timerpart2minute timerpart3minute, ///
		over(condition, relabel(1 "Control" 2 "Treat 1: Altruistic" 3 "Treat 2: Data" 4 "Treat 3: Monetary"))  ///
		legend(label (1 "Part I") label (2 "Part II") label (3 "Part III")) ///
		ytitle("Average Completion Time in Minutes") ///
		blabel(total, format(%9.1f)) ///
		note("Note: Excludes progress<20%, total completion time>= 1.5*IQR, and part 2 completion time>33 minutes.") 

*Appendix Figure 10- Graph average conjoint completion time by condition.
	graph set window fontface "Times New Roman"
	set scheme lean1
		graph bar timerconjoint1minute timerconjoint2minute timerconjoint3minute timerconjoint4minute timerconjoint5minute, ///
		over(condition, relabel(1 "Control" 2 "Treat 1: Altruistic" 3 "Treat 2: Data" 4 "Treat 3: Monetary"))  ///
		legend(label (1 "Conjoint 1") label (2 "Conjoint 2") label (3 "Conjoint 3") label (4 "Conjoint 4") label (5 "Conjoint 5")) ///
		ytitle("Average Completion Time in Minutes") ///
		blabel(total, format(%9.1f)) ///
		note("Note: Excludes progress<20% and total completion time>= 1.5*IQR.") 
		
		*Appendix Tukey HSD corrections.
		reg timerpart1minute i.condition
		pwmean timerpart1minute, over(i.condition) mcompare(tukey) effects
		
		reg timerpart2minute i.condition
		pwmean timerpart2minute, over(i.condition) mcompare(tukey) effects

*Proportion of speeders for each survey part by condition.
ta speederpart1 condition,col
ta speederpart2 condition,col
ta speederpart3 condition,col

*main text Figure 3 and Appendix figure 8.
		graph bar speederpart1 speederpart2 speederpart3, ///
		over(condition, relabel(1 "Control" 2 "Treat 1: Altruistic" 3 "Treat 2: Data" 4 "Treat 3: Monetary")) ///	 					
		legend(label (1 "Part 1") label (2 "Part 2") label (3 "Part 3")) ///
		ytitle("Proportion of Speeders") ///
		blabel(total, format(%9.2f)) ///
		note("Note: Excludes progress<20%, total completion time>= 1.5*IQR. Speeders are those who finished " "the section faster than an average of 300 msec per word.") 
		
		*speeders treat 2 v treat 3 (part 2)
		prtesti 21 .2381 55 .2727
		*speeders treat 2 v treat 3 (part 1)
		prtesti 21 .0476 55 .1636
		*speeders treat 2 v treat 3 (part 3)
		prtesti 21 0 55 .0545
		
		*speeders control v treat 3 (part 1)
		prtesti 24 .0417 55 .1636
		*speeders control v treat 3 (part 2)
		prtesti 24 .2083 55 .2727
		*speeders control v treat 3 (part 3)
		prtesti 24 .0417 55 .0545
		
	*Probit models predicting speeding (Appendix Table 5).
			probit speederpart1 i.condition
				*get AIC for model results.
				estat ic
			
			probit speederpart2 i.condition
				*get AIC for model results.
				estat ic

			probit speederpart3 i.condition
				*get AIC for model results.
				estat ic

********************************************************************
*Straightlining Analysis.
********************************************************************

		nbreg straightlinedcount i.condition
		margins i.condition
		*Figure 1 (main text) and Appendix Figure 4- Expected Count Figure*
		graph set window fontface "Times New Roman"
		set scheme lean1
		marginsplot, recast(scatter) xtitle("Condition") ytitle("Predicted Count of Straightlined Questions") ///
		title("") xlabel(0 "Control" 1 "Treat 1: Altruistic" 2 "Treat 2: Data" 3 "Treat 3: Monetary") plotopts(msymbol(o) ) ///
		note("Note: Expected counts calculated using results from negative binomial regression. Error bars are 95% confidence intervals" "calculated using margins in Stata v.15.1. Excludes respondents where progress<20% or total completion time>= 1.5*IQR.") 

********************************************************************
*Item non-response: total number of questions R leaves blank across conjoint blocks.
********************************************************************

*Import survey data.
import delimited "ParticipationIncentivesDataAnonymized.csv", delimiter(comma) varnames(1) clear 

	*One recipient was sent a custom email that did not have an embedded data field. As a result, 
		**none of the conjoint blocks were filled out.
		**We drop this case for the data quality analyses but include him as completing the survey
		**for the response rates calculations. He was assigned to the control group.
		tab condition if responseid=="R_20Z3OdQH2Wn2QtZ", missing
		drop if responseid=="R_20Z3OdQH2Wn2QtZ"			
	
	drop if condition==.
	tab condition, missing

	*Drop those who failed/ did not answer the screener.
	drop if screener!="Yes"
	*Yields n=198.
	
	*Figure 2-Main text.
	graph set window fontface "Times New Roman"
	set scheme lean1	
	nbreg nmisconjoints i.condition
	margins i.condition
	marginsplot,recast(scatter) xtitle("Condition") ytitle("Predicted Count of Skipped Questions") ///
	title("") xlabel(0 "Control" 1 "Treat 1: Altruistic" 2 "Treat 2: Data" 3 "Treat 3: Monetary") plotopts(msymbol(o) ) 
		
*Appendix Figure 5.	
		graph set window fontface "Times New Roman"
		set scheme lean1
	graph bar nmisconjoints, ///
		over(condition, relabel(1 "Control" 2 "Treat 1: Altruistic" 3 "Treat 2: Data" 4 "Treat 3: Monetary")) ///
		blabel(total, format(%9.2f)) ///
		ytitle("Mean Skipped Conjoint Questions") ///
		note("Note: Total conjoint questions=25.") 
		
	*means.
	ta condition, sum(nmisconjoints)
	mean nmisconjoints, over(condition)
	
	*Appendix Figure 6.
	reg nmisconjoints i.condition
	margins i.condition
	marginsplot,recast(scatter) xtitle("") ytitle("Mean Skipped Conjoint Questions") ///
		title("") xlabel(0 "Control" 1 "Treat 1: Altruistic" 2 "Treat 2: Data" 3 "Treat 3: Monetary") plotopts(msymbol(o) ) ///
		note("Note: OLS estimates. Error bars are 95% confidence intervals.") 
	
	*Appendix Figure 7- Proportion of missed conjoints out of total (OLS)-pre-registered analysis.	
	gen propmissed=nmisconjoints/25
	sum propmissed, detail
	reg propmissed i.condition
	margins i.condition
	marginsplot, recast(scatter) xtitle("") ytitle("Proportion of Skipped Conjoint Questions") ///
	title("") xlabel(0 "Control" 1 "Treat 1: Altruistic" 2 "Treat 2: Data" 3 "Treat 3: Monetary") plotopts(msymbol(o) ) ///
	note("Note: Proportion of skipped conjoint questions out of total. OLS estimates. Error bars are 95% confidence intervals.") 

*************************************************************		
*Appendix results for Sample v Overall Population Comparison.	
*************************************************************		
	
import delimited "AppendixData.csv", delimiter(comma) varnames(1) clear
			
			*Median founded year in full sampling frame.
			sum founded2, detail
			
*founded year per condition.
bysort condition: sum founded2, detail

*overall missings.
ta founded2, missing 

*t test for whether responders versus non-responders differ in founded year.
ttest founded2,by(responded) unequal

*two sample t tests (n1 mean1 sd1 n2 mean2 sd2)
*mean founded overall versus in control.
ttesti 612 1993.739  23.13534 2 2003.5 4.949747

*mean founded overall versus treat1.
ttesti 612 1993.739  23.13534 8 2002 11.40175

*mean founded overall versus treat2.
ttesti 612 1993.739  23.13534 7 2000.429  11.17821

*mean founded overall versus treat3.
ttesti 612 1993.739  23.13534 10 2002.4  9.191784

ta region,missing
ta region condition, missing col

*diff in proportions tests. n1 prop1 n2 prop2

*Missings.
*overall sample vs control missing.
prtesti 19336 0.1101 72 0.0694

*overall sample vs treat1 missing.
prtesti 19336 0.1101 84 0.0833

*overall sample vs treat2 missing.
prtesti 19336 0.1101 84 0.0833

*overall sample vs treat3 missing.
prtesti 19336 0.1101 132 0.1061

*Midwest.
*overall sample vs control.
prtesti 19336 0.0847 72 0.0417

*overall sample vs treat1.
prtesti 19336 0.0847 84 0.1310

*overall sample vs treat2.
prtesti 19336 0.0847 84 0.1190

*overall sample vs treat3.
prtesti 19336 0.0847 132 0.0909
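
*Sketch (not part of the original replication): the four Midwest tests above written as one loop.
*Each quoted pair is "n proportion" for control, treat1, treat2, treat3, copied from the lines above.
*The local names (n_overall, p_overall, pair) are illustrative; the output repeats the tests above.
local n_overall 19336
local p_overall 0.0847
foreach pair in "72 0.0417" "84 0.1310" "84 0.1190" "132 0.0909" {
	prtesti `n_overall' `p_overall' `pair'
}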

*Northeast.
*overall sample vs control.
prtesti 19336 0.3607 72 0.3472

*overall sample vs treat1.
prtesti 19336 0.3607 84 0.3571

*overall sample vs treat2.
prtesti 19336 0.3607 84 0.3571

*overall sample vs treat3.
prtesti 19336 0.3607 132 0.2652

*South.
*overall sample vs control.
prtesti 19336 0.2 72 0.2361

*overall sample vs treat1.
prtesti 19336 0.2 84 0.25

*overall sample vs treat2.
prtesti 19336 0.2 84 0.1310

*overall sample vs treat3.
prtesti 19336 0.2 132 0.1818

*West.
*overall sample vs control.
prtesti 19336 0.2445 72 0.3056

*overall sample vs treat1.
prtesti 19336 0.2445 84 0.1786

*overall sample vs treat2.
prtesti 19336 0.2445 84 0.3095

*overall sample vs treat3.
prtesti 19336 0.2445 132 0.3561

*Activities Analysis.

ta activitiesclean,missing
ta activitiesclean condition,missing col

*Difference in proportions test for activities.
*Missings.
*overall sample vs control missing.
prtesti 19336 0.5897 72 0.5714

*overall sample vs treat1 missing.
prtesti 19336 0.5897 84 0.5714

*overall sample vs treat2 missing.
prtesti 19336 0.5897 84 0.6071

*overall sample vs treat3 missing.
prtesti 19336 0.5897 132 0.4470

*Development.
*overall sample vs control .
prtesti 19336 0.0912 72 0.0833

*overall sample vs treat1 .
prtesti 19336 0.0912 84 0.0357

*overall sample vs treat2 .
prtesti 19336 0.0912 84 0.1071

*overall sample vs treat3 .
prtesti 19336 0.0912 132 0.1364


*EDUCATION.
*overall sample vs control .
prtesti 19336 0.0988 72 0.1528

*overall sample vs treat1 .
prtesti 19336 0.0988 84 0.1429

*overall sample vs treat2 .
prtesti 19336 0.0988 84 0.0595

*overall sample vs treat3 .
prtesti 19336 0.0988 132 0.1288

*HEALTH.
*overall sample vs control .
prtesti 19336 0.0423 72 0.0417

*overall sample vs treat1 .
prtesti 19336 0.0423 84 0.0357

*overall sample vs treat2 .
prtesti 19336 0.0423 84 0.0238

*overall sample vs treat3 .
prtesti 19336 0.0423 132 0.0606

*OTHER.
*overall sample vs control .
prtesti 19336 0.0791 72 0.1111

*overall sample vs treat1 .
prtesti 19336 0.0791 84 0.1310

*overall sample vs treat2 .
prtesti 19336 0.0791 84 0.1071

*overall sample vs treat3 .
prtesti 19336 0.0791 132 0.1212


*YOUTH/CHILDREN/FAMILY/ELDERLY/WOMEN.
*overall sample vs control .
prtesti 19336 0.0989 72 0.0694

*overall sample vs treat1 .
prtesti 19336 0.0989 84 0.0833

*overall sample vs treat2 .
prtesti 19336 0.0989 84 0.0952

*overall sample vs treat3 .
prtesti 19336 0.0989 132 0.1061

